Abstract. Modern functional-logic programming languages like Toy or 
Curry feature non-strict non-deterministic functions that behave under 
call-time choice semantics. A standard formulation for this semantics is 
the CRWL logic, which specifies a proof calculus for computing the set 
of possible results for each expression. In this paper we present a 
formalization of that calculus in the Isabelle/HOL proof assistant. We have 
proved some basic properties of CRWL: closedness under c-substitutions, 
polarity and compositionality. We also discuss some insights that have 
been gained, such as the fact that left linearity of program rules is not 
needed for any of these results to hold. 

1 Introduction 

Fully formalizing the (meta)theory of a programming language can be beneficial 
for developing its foundations. There is an increasing number of researchers (see 
e.g. [2]) sharing the conviction that the combination formalization+mechanized 
theorem proving must (and will) play a prominent role in programming languages 
research and technology. In particular, formalizations help to clarify overlooked 
aspects, to discover pitfalls, and even to provide new insights; moreover, 
formalized metatheories lead to mechanized reasoning about programs, giving reliable 
support to tools like certifying compilers or certified program transformations. 

In this paper we formalize the semantics of functional logic programming 
(FLP), a well established paradigm (see [9]) integrating features of logic and 
functional languages. In modern FLP languages such as Curry [10] or Toy [14] 
programs are constructor based rewrite systems that may be non-terminating 
and non-confluent. Semantically this leads to the presence of non-strict and 
non-deterministic functions. The semantics adopted for non-determinism is 
call-time choice [11,8], informally meaning that in any reduction, all descendants 
of a given subexpression must share the same value. The semantic framework 
CRWL (which stands for "Constructor-based ReWriting Logic") was proposed in 
[7, 8] to accommodate this view of non-determinism, and is nowadays considered 
the standard semantics of FLP. For the purpose of this 
paper, the most relevant aspect of CRWL is a proof calculus devised to prove 
reduction statements of the form $\mathcal{P} \vdash e \to t$, meaning that $t$ is a possible 
(partial) value to which $e$ can be reduced using the program $\mathcal{P}$. 

We have chosen Isabelle/HOL as the concrete logical framework for our 
formalization. Using such a widely used system is not only easier, but also more 
flexible and stable than developing language-specific tools, as has been done, 
e.g., for logic programming [15] or functional programming [6]. 

The remainder of the paper is organized as follows: Sect. 2 contains some 
preliminaries about the CRWL framework, Sect. 3 presents the Isabelle theories 
developed to formalize CRWL, and Sect. 4 gives the mechanized proofs of some 
important properties of CRWL. Finally, Sect. 5 summarizes some conclusions 
and points to future work. 

An extended version of this paper can be found at 
http://gpd.sip.ucm.es/juanrh/pubs/isabell-crwl-report.pdf. The Isabelle code 
underlying the results presented here is available at 
https://gpd.sip.ucm.es/trac/gpd/wiki/GpdSystems/IsabelleCrwl. 

2 Preliminaries 

2.1 Constructor-based term rewrite systems 

We consider a first-order signature $\Sigma = CS \cup FS$, where $CS$ and $FS$ are two 
disjoint sets of constructor and defined function symbols respectively, each with 
an associated arity. We write $CS^n$ ($FS^n$ resp.) for the set of constructor (function) 
symbols of arity $n$. The set $Exp$ of expressions is inductively defined as 

$$Exp \ni e ::= X \mid h(e_1, \ldots, e_n),$$

where $X \in \mathcal{V}$, $h \in CS^n \cup FS^n$ and $e_1, \ldots, e_n \in Exp$. The set $CTerm$ of 
constructed terms (or c-terms) is defined like $Exp$, but with $h$ restricted to $CS^n$ (so 
$CTerm \subseteq Exp$). The intended meaning is that $Exp$ stands for evaluable 
expressions, i.e., expressions that can contain function symbols, while $CTerm$ stands 
for data terms representing values. We will write $e, e', \ldots$ for expressions and 
$t, s, \ldots$ for c-terms. The set of variables occurring in an expression $e$ will be 
denoted as $var(e)$. We will frequently use one-hole contexts, defined as 

$$Cntxt \ni C ::= [\ ] \mid h(e_1, \ldots, C, \ldots, e_n)$$

for $h \in CS^n \cup FS^n$. The application of a context $C$ to an expression $e$, written 
$C[e]$, is defined inductively by 

$$[\ ][e] = e \quad\text{and}\quad h(e_1, \ldots, C, \ldots, e_n)[e] = h(e_1, \ldots, C[e], \ldots, e_n).$$

The set $Subst$ of substitutions consists of finite mappings $\theta : \mathcal{V} \to Exp$ (i.e., 
mappings such that $\theta(X) \neq X$ only for finitely many $X \in \mathcal{V}$), which extend 
naturally to $\theta : Exp \to Exp$. We write $e\theta$ for the application of $\theta$ to $e$, and $\theta\theta'$ 
for the composition of substitutions, defined by $X(\theta\theta') = (X\theta)\theta'$. The domain 
of $\theta$ is defined as $dom(\theta) = \{X \in \mathcal{V} \mid X\theta \neq X\}$. In most cases we will use 
c-substitutions $\theta \in CSubst$, for which $X\theta \in CTerm$ for all $X \in dom(\theta)$. 

[Fig. 1. Rules of CRWL] 

A CRWL-program (or simply a program) is a set of rewrite rules of the 
form $f(\overline{t}) \to e$ where $f \in FS^n$, $e \in Exp$ and $\overline{t}$ is a linear $n$-tuple of c-terms, 
where linearity means that each variable occurs only once in $\overline{t}$. Notice that we 
allow $e$ to contain extra variables, i.e., variables not occurring in $\overline{t}$. 
CRWL-programs often also allow conditions in the program rules. However, 
CRWL-programs with conditions can be transformed into equivalent programs without 
conditions, so we consider only unconditional rules. 
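
The two syntactic restrictions on program rules can be illustrated with a small Python sketch. The nested-tuple encoding of expressions and all names used here (`var`, `con`, `fun`, `is_cterm`, `variables`, `is_crwl_rule`) are assumptions made for this illustration only; they are not part of the formalization described in this paper.

```python
# Illustrative encoding (an assumption of this sketch, not the Isabelle code):
#   ("var", name) | ("cs", name, args) | ("fs", name, args)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def is_cterm(e):
    """True if e is built from variables and constructor symbols only."""
    if e[0] == "var":
        return True
    return e[0] == "cs" and all(is_cterm(a) for a in e[2])

def variables(e):
    """Variable occurrences of e, left to right, with repetitions."""
    if e[0] == "var":
        return [e[1]]
    return [x for a in e[2] for x in variables(a)]

def is_crwl_rule(lhs, rhs):
    """Check that lhs is f(t1, ..., tn) with a linear tuple of c-terms.

    The right-hand side is unrestricted, so extra variables are allowed.
    """
    if lhs[0] != "fs":
        return False
    occurrences = variables(lhs)
    return (all(is_cterm(t) for t in lhs[2])
            and len(occurrences) == len(set(occurrences)))

X, Y = var("X"), var("Y")
ok_rule = (fun("dup", X), con("pair", X, X))      # linear and constructor-based
non_linear = (fun("f", X, X), con("a"))           # X occurs twice on the left
non_constructor = (fun("g", fun("h", X)), X)      # function symbol in a pattern
extra_var = (fun("k", X), con("pair", X, Y))      # Y is an extra variable

checks = [is_crwl_rule(*r)
          for r in (ok_rule, non_linear, non_constructor, extra_var)]
print(checks)  # prints [True, False, False, True]
```

Note that the rule with an extra variable is accepted, since only the left-hand side is constrained.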

2.2 The CRWL framework 

In order to accommodate non-strictness at the semantic level, we enlarge $\Sigma$ with a 
new constant constructor symbol $\bot$. The sets $Exp_\bot$, $CTerm_\bot$, $Subst_\bot$, $CSubst_\bot$ 
of partial expressions, etc., are defined naturally. Notice that $\bot$ does not appear 
in programs. Partial expressions are ordered by the approximation ordering $\sqsubseteq$, 
defined as the least partial ordering satisfying 

$$\bot \sqsubseteq e \quad\text{and}\quad e \sqsubseteq e' \Rightarrow C[e] \sqsubseteq C[e'] \quad\text{for all } e, e' \in Exp_\bot,\ C \in Cntxt.$$

This partial ordering can be extended to substitutions: given $\theta, \sigma \in Subst_\bot$ we 
say $\theta \sqsubseteq \sigma$ if $X\theta \sqsubseteq X\sigma$ for all $X \in \mathcal{V}$. 
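
As a hedged illustration, the approximation ordering can be decided structurally: one expression approximates another when it can be obtained from it by replacing some subexpressions with $\bot$. The Python sketch below assumes a nested-tuple encoding of partial expressions and represents substitutions as dictionaries; the names `leq` and `leq_subst` are hypothetical.

```python
# Assumed encoding: ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def leq(a, b):
    """True if a approximates b, i.e. a is b with some subterms cut to perp."""
    if a == PERP:
        return True
    if a[0] == "var" or b[0] in ("var", "perp"):
        return a == b
    return (a[:2] == b[:2] and len(a[2]) == len(b[2])
            and all(leq(x, y) for x, y in zip(a[2], b[2])))

def leq_subst(theta, sigma):
    """Pointwise extension to substitutions given as dicts.

    A variable missing from a dict is mapped to itself.
    """
    names = set(theta) | set(sigma)
    return all(leq(theta.get(x, var(x)), sigma.get(x, var(x))) for x in names)

zero, one = con("zero"), con("one")
print(leq(con("pair", PERP, zero), con("pair", one, zero)))   # prints True
print(leq(con("pair", zero, PERP), con("pair", one, zero)))   # prints False
print(leq_subst({"X": PERP}, {"X": zero}))                    # prints True
```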

The semantics of a program $\mathcal{P}$ is determined in CRWL by means of a proof 
calculus (see Fig. 1) for deriving reduction statements $\mathcal{P} \vdash e \to t$, with $e \in Exp_\bot$ 
and $t \in CTerm_\bot$, meaning informally that $t$ is (or approximates) a possible value 
of $e$, obtained by iterated reduction of $e$ using $\mathcal{P}$ under call-time choice. Rule B 
(bottom) allows us to avoid the evaluation of any expression, in order to get a 
non-strict semantics. Rules RR (restricted reflexivity) and DC (decomposition) 
allow us to reduce any variable to itself, and to decompose the evaluation of 
an expression whose root symbol is a constructor. Rule OR (outer reduction) 
expresses that to evaluate a function call we must first evaluate its arguments 
to get an instance of a program rule, perform parameter passing (by means of a 
$CSubst_\bot$ $\theta$) and then reduce the instantiated right-hand side. The use of partial 
c-substitutions in OR is essential to express call-time choice, as only single partial 
values are used for parameter passing. Notice also that by the effect of $\theta$ in OR 
extra variables in the right-hand side of a rule can be replaced by any c-term, 
but not by any expression. The CRWL-denotation of an expression $e \in Exp_\bot$ is 
defined as $[\![e]\!]^{\mathcal{P}} = \{t \in CTerm_\bot \mid \mathcal{P} \vdash_{CRWL} e \to t\}$. 
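
To make the effect of call-time choice concrete, the following Python sketch computes a bounded approximation of the denotation by following the four rules as described above. It is only an illustration under several assumptions of ours: expressions are nested tuples, programs are lists of rule pairs, the search is cut off after `depth` nested uses of OR, and extra variables of a rule are left uninstantiated. The names `den`, `match` and `ap_subst` are hypothetical.

```python
from itertools import product

# Assumed encoding: ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def ap_subst(theta, e):
    """Apply a substitution given as a dict from variable names to expressions."""
    if e[0] == "var":
        return theta.get(e[1], e)
    if e[0] == "perp":
        return e
    return (e[0], e[1], tuple(ap_subst(theta, a) for a in e[2]))

def match(p, t, theta):
    """Extend theta so that ap_subst(theta, p) == t; None if impossible."""
    if theta is None:
        return None
    if p[0] == "var":
        bound = theta.get(p[1])
        if bound is None:
            return {**theta, p[1]: t}
        return theta if bound == t else None
    if p[0] != "cs" or t[0] != "cs" or p[1] != t[1] or len(p[2]) != len(t[2]):
        return None
    for pa, ta in zip(p[2], t[2]):
        theta = match(pa, ta, theta)
    return theta

def den(prog, e, depth):
    """Values of e derivable with at most `depth` nested uses of rule OR."""
    vals = {PERP}                                         # rule B
    if e[0] == "var":
        vals.add(e)                                       # rule RR
    elif e[0] in ("cs", "fs"):
        for ts in product(*(den(prog, a, depth) for a in e[2])):
            if e[0] == "cs":
                vals.add(("cs", e[1], ts))                # rule DC
            elif depth > 0:                               # rule OR
                for lhs, rhs in prog:
                    if lhs[1] == e[1] and len(lhs[2]) == len(ts):
                        theta = {}
                        for p, t in zip(lhs[2], ts):
                            theta = match(p, t, theta)
                        if theta is not None:
                            vals |= den(prog, ap_subst(theta, rhs), depth - 1)
    return vals

# Example program: coin -> zero, coin -> one, dup(X) -> pair(X, X)
zero, one, X = con("zero"), con("one"), var("X")
prog = [(fun("coin"), zero), (fun("coin"), one),
        (fun("dup", X), con("pair", X, X))]

values = den(prog, fun("dup", fun("coin")), 2)
print(con("pair", zero, zero) in values)   # prints True
print(con("pair", zero, one) in values)    # prints False
```

Both copies of `X` receive the same partial value, so `pair(zero, one)` is not a value of `dup(coin)`, which is what call-time choice prescribes.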

3 Formalizing CRWL in Isabelle 
3.1 Basic definitions 

We describe our formalization of CRWL in Isabelle. The first step is to define 
elementary types for the syntactic elements. 

datatype signat = fs string | cs string 
datatype varId = vi string 

datatype exp = perp | Var varId | Ap signat "exp list" 
types 

  subst = "varId ⇒ exp option" 
  rule = "exp * exp" 
  program = "rule set" 

Signatures are represented by a datatype that provides two constructors cs and 
fs to distinguish between constructor and function symbols. The type varId is 
used to represent variable identifiers, which will be employed to define 
substitutions. The datatype exp is then defined following the inductive scheme 
of $Exp_\bot$, so with this representation every expression is partial by default. 

Substitutions (type subst) are represented as partial functions from variable 
identifiers to expressions, using Isabelle's option type. Hence the domain 
of a substitution ϑ is the set of elements of varId for which ϑ returns 
some value different from None. Note that this representation does not ensure 
that domains of substitutions are finite; our proofs do not rely on this 
finiteness assumption. Finally, we represent a program rule as a pair of expressions, 
where the first element is the left-hand side of the rule and the 
second the right-hand side, and a program simply as a set of program rules. The 
set of valid CRWL programs is characterized by a predicate crwlProgram :: 
"program ⇒ bool" that checks whether the restrictions of left-linearity and 
constructor discipline are satisfied. 

We define a function apSubst :: "subst ⇒ exp ⇒ exp" for applying a 
substitution to an expression. The composition of substitutions is defined through 
a function substComp :: "subst ⇒ subst ⇒ subst". The following lemma 
ensures the correctness of this definition: 

lemma subsCompAp: 
  "(apSubst ϑ (apSubst σ e)) = (apSubst (substComp ϑ σ) e)" 

Just as in ML, the Isabelle type system does not support subtyping, which could 
have been useful to represent the sets of c-terms and c-substitutions. Instead, 
we define predicates cterm and csubst characterizing these subtypes. We prove 
the expected lemmas, such as that the composition of two c-substitutions is a 
c-substitution, or that the application of a c-substitution to a c-term yields a 
c-term. 
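
The property stated by lemma subsCompAp can be checked on examples with a small Python sketch. Here substitutions are dictionaries (a missing key plays the role of None in the option type), and `ap_subst`, `subst_comp`, `is_cterm` and `is_csubst` are hypothetical counterparts of apSubst, substComp, cterm and csubst written for this illustration; they are not a transcription of the Isabelle definitions.

```python
# Assumed encoding: ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def ap_subst(theta, e):
    """Apply theta to e; variables outside the domain are left untouched."""
    if e[0] == "var":
        return theta.get(e[1], e)
    if e[0] == "perp":
        return e
    return (e[0], e[1], tuple(ap_subst(theta, a) for a in e[2]))

def subst_comp(theta, sigma):
    """Substitution that behaves like applying sigma first and theta afterwards."""
    comp = {x: ap_subst(theta, e) for x, e in sigma.items()}
    for x, e in theta.items():
        comp.setdefault(x, e)   # variables bound only by theta
    return comp

def is_cterm(e):
    """Partial c-terms: variables, perp and constructor applications."""
    if e[0] in ("var", "perp"):
        return True
    return e[0] == "cs" and all(is_cterm(a) for a in e[2])

def is_csubst(theta):
    return all(is_cterm(e) for e in theta.values())

theta = {"Y": con("z")}
sigma = {"X": con("s", var("Y"))}
e = con("pair", var("X"), var("Y"))

step_by_step = ap_subst(theta, ap_subst(sigma, e))
composed = ap_subst(subst_comp(theta, sigma), e)
print(step_by_step == composed)                 # prints True
print(is_csubst(subst_comp(theta, sigma)))      # prints True
```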



3.2 Approximation order and contexts 



Two key notions of CRWL have not yet been formalized: the approximation 
order ⊑, which will be used in the formulation of the polarity of CRWL, and the 
notion of one-hole context, which will be used in the compositionality result. 

The following inductively defined predicate ordap (with concrete infix syntax 
⊑) models the approximation order. 

inductive 
  ordap :: "exp ⇒ exp ⇒ bool" ("_ ⊑ _" [51,51] 50) 
where 
  B: "perp ⊑ e" 
| V: "Var x ⊑ Var x" 
| Ap: "⟦ size es = size es'; ALL i < size es. es!i ⊑ es'!i ⟧ 
       ⟹ Ap h es ⊑ Ap h es'" 

Rule B asserts that perp ⊑ e holds for every e; rule V is needed for ⊑ to be 
reflexive; finally, rule Ap ensures closedness under Σ-operations, and thus 
compatibility with contexts [3], because ⊑ is reflexive and transitive, as we will see. 
The following results state that our formulation of ⊑ defines a partial order. 

lemma ordapRefl: "e ⊑ e" 
lemma ordapTrans: 
  assumes "e1 ⊑ e2" and "e2 ⊑ e3" 
  shows "e1 ⊑ e3" 
lemma ordapAntisym: 
  assumes "e1 ⊑ e2" and "e2 ⊑ e1" 
  shows "e1 = e2" 
definition ordap_less ("_ ⊏ _" [51,51] 50) where 
  "e ⊏ e' ≡ e ⊑ e' ∧ e ≠ e'" 
interpretation exp : order [ordap ordap_less] 

Contexts are represented as the datatype cntxt, defined as follows: 

datatype cntxt = Hole | Cperp | CVar varId 
               | CAp signat "cntxt list" 

Note that cntxt cannot follow the inductive structure of Cntxt precisely, 
because the type system of Isabelle is not expressive enough to allow us to 
specify that only one of the arguments of CAp is a context and the others 
are expressions. Instead, our contexts are defined as expressions with possibly 
some holes inside. The datatype cntxt therefore represents contexts with any 
number of holes, even zero, and the function apCon :: "exp ⇒ cntxt ⇒ 
exp" is defined so that it puts the argument expression in every hole of the argument 
context. In order to characterize contexts with just one hole, we define a function 
numHoles :: "cntxt ⇒ nat" that returns the number of holes in a context. 
Using it we can define predicates oneHole and noHole and prove the 
following lemmas. 



lemma noHoleApDontCare: 
  assumes "noHole xC" 
  shows "apCon e xC = apCon e' xC" 

lemma oneHole: 
  assumes "oneHole (CAp h xCs)" 
  shows "∃ xC yCs zCs. xCs = (yCs @ xC # zCs) ∧ oneHole xC ∧ 
         (∀c ∈ set (yCs @ zCs). noHole c)" 
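
The representation of contexts as expressions with any number of holes can be mimicked in a few lines of Python. The sketch below is an assumption-laden illustration (tuple encoding, hypothetical names `num_holes`, `ap_con`, `one_hole`, `no_hole`) that checks the two lemmas above on concrete contexts rather than proving them.

```python
# Assumed encoding: ("hole",) | ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
HOLE = ("hole",)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def num_holes(c):
    """Number of holes occurring in the context c."""
    if c == HOLE:
        return 1
    if c[0] in ("var", "perp"):
        return 0
    return sum(num_holes(a) for a in c[2])

def ap_con(e, c):
    """Put the expression e in every hole of the context c."""
    if c == HOLE:
        return e
    if c[0] in ("var", "perp"):
        return c
    return (c[0], c[1], tuple(ap_con(e, a) for a in c[2]))

def one_hole(c):
    return num_holes(c) == 1

def no_hole(c):
    return num_holes(c) == 0

closed = con("pair", con("zero"), PERP)                    # no hole at all
single = fun("f", con("zero"), con("s", HOLE), var("Y"))   # exactly one hole
double = con("pair", HOLE, HOLE)                           # two holes

# noHoleApDontCare: the plugged expression is irrelevant without holes
print(ap_con(fun("coin"), closed) == ap_con(PERP, closed))   # prints True
# oneHole: exactly one argument carries the hole, the others have none
print([num_holes(a) for a in single[2]])                     # prints [0, 1, 0]
```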



3.3 The CRWL logic in Isabelle/HOL 



The CRWL logic has been formalized through the inductive predicate clto with 
infix notation "_ ⊢ _ → _". The rules defining clto faithfully follow the 
inductive structure of the definition of CRWL as it is presented in Fig. 1. 

inductive 
  clto :: "program ⇒ exp ⇒ exp ⇒ bool" ("_ ⊢ _ → _" [100,50,50] 38) 
where 
  B[intro]:  "prog ⊢ exp → perp" 
| RR[intro]: "prog ⊢ Var v → Var v" 
| DC[intro]: "⟦ size es = size ts; 
               ∀i < size es. prog ⊢ es!i → ts!i 
             ⟧ ⟹ prog ⊢ Ap (cs c) es → Ap (cs c) ts" 
| OR[intro]: "⟦ (Ap (fs f) ps, r) ∈ prog; csubst ϑ; 
               size es = size ps; 
               ∀i < size es. prog ⊢ es!i → apSubst ϑ (ps!i); 
               prog ⊢ apSubst ϑ r → t 
             ⟧ ⟹ prog ⊢ Ap (fs f) es → t" 

Using clto we can easily define the CRWL denotations in Isabelle as follows: 

definition den :: "program ⇒ exp ⇒ exp set" where 
  "den P e = {t. P ⊢ e → t}" 



4 Reasoning about CRWL in Isabelle 

The first interesting property that we prove about CRWL expresses that 
evaluation is closed under c-substitutions: reductions are preserved when terms 
are instantiated by c-substitutions. 

theorem crwlClosedCSubst: 
  assumes "prog ⊢ e → t" and "csubst ϑ" 
  shows "prog ⊢ apSubst ϑ e → apSubst ϑ t" 

The proof of this theorem proceeds by induction on the CRWL-proof of the 
hypothesis, so we have one case for each CRWL rule. The first three 
cases are proved automatically. However, to prove the case for rule OR Isabelle 
needs some help from us. We need to prove 

  prog ⊢ (Ap (fs f) (map (apSubst ϑ) es)) → (apSubst ϑ t) 

and then let the simplifier apply the definition of apSubst. In the proof of 
that subgoal we used lemma CSubsComp to ensure that the c-substitution μ used 
for parameter passing composed with the c-substitution ϑ in the hypothesis 
yields another c-substitution, and lemma subsCompAp to guarantee the correct 
behaviour of the composition for those c-substitutions. 

Note that for this result to hold no additional hypotheses about the program 
or the expressions involved are needed. In particular, this implies that the result 
holds even for programs that do not follow the constructor discipline or that 
have non-left-linear rules. The Isabelle proof clearly shows that the important 
ingredients are the use of c-substitutions for parameter passing and the reflexivity 
of CRWL w.r.t. c-terms, expressed by lemma ctermRefl, which allows us to reduce 
to itself any expression Xϑ coming from a premise X → X. 
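
Closedness under c-substitutions can be observed on a small example with a bounded evaluator. The Python sketch below is not the Isabelle proof; it is a test on one program, under our own assumptions (tuple encoding, search cut off after `depth` nested OR steps, extra variables left uninstantiated, hypothetical names `den`, `match`, `ap_subst`).

```python
from itertools import product

# Assumed encoding: ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def ap_subst(theta, e):
    """Apply a substitution given as a dict from variable names to expressions."""
    if e[0] == "var":
        return theta.get(e[1], e)
    if e[0] == "perp":
        return e
    return (e[0], e[1], tuple(ap_subst(theta, a) for a in e[2]))

def match(p, t, theta):
    """Extend theta so that ap_subst(theta, p) == t; None if impossible."""
    if theta is None:
        return None
    if p[0] == "var":
        bound = theta.get(p[1])
        if bound is None:
            return {**theta, p[1]: t}
        return theta if bound == t else None
    if p[0] != "cs" or t[0] != "cs" or p[1] != t[1] or len(p[2]) != len(t[2]):
        return None
    for pa, ta in zip(p[2], t[2]):
        theta = match(pa, ta, theta)
    return theta

def den(prog, e, depth):
    """Values of e derivable with at most `depth` nested uses of rule OR."""
    vals = {PERP}                                         # rule B
    if e[0] == "var":
        vals.add(e)                                       # rule RR
    elif e[0] in ("cs", "fs"):
        for ts in product(*(den(prog, a, depth) for a in e[2])):
            if e[0] == "cs":
                vals.add(("cs", e[1], ts))                # rule DC
            elif depth > 0:                               # rule OR
                for lhs, rhs in prog:
                    if lhs[1] == e[1] and len(lhs[2]) == len(ts):
                        theta = {}
                        for p, t in zip(lhs[2], ts):
                            theta = match(p, t, theta)
                        if theta is not None:
                            vals |= den(prog, ap_subst(theta, rhs), depth - 1)
    return vals

# Program: dup(X) -> pair(X, X); expression dup(Y) with a free variable Y
X, Y = var("X"), var("Y")
prog = [(fun("dup", X), con("pair", X, X))]
e = fun("dup", Y)
c_subst = {"Y": con("s", con("z"))}       # a c-substitution: Y |-> s(z)

before = den(prog, e, 1)
after = den(prog, ap_subst(c_subst, e), 1)
closed = all(ap_subst(c_subst, t) in after for t in before)
print(closed)   # prints True
```

Every value of `dup(Y)`, once instantiated, is again a value of `dup(s(z))`; the converse inclusion is not claimed by the theorem and does not hold here, since `pair(s(⊥), s(z))` is a value of the instance only.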

The second property that we address is the polarity of CRWL. This property 
is formulated by means of the approximation order and roughly says that if we 
can compute a value for an expression then we can compute a smaller value 
for a bigger expression. Here we should understand the approximation order 
as an information order, in the sense that the bigger the expression, the more 
information it gives, and so more values can be computed from it. 

theorem crwlPolarity: 
  assumes "prog ⊢ e → t" and "e ⊑ e'" and "t' ⊑ t" 
  shows "prog ⊢ e' → t'" 
  using assms proof (induct arbitrary: e' t') 

The idea of the proof is to construct a CRWL-proof for the conclusion from the 
CRWL-proof of the hypothesis, hence it is natural to proceed by induction on 
the structure of this proof (method induct). The qualifier arbitrary is used 
to generalize the assertion for any expressions e' and t'. The proof also relies 
on the following additional lemmas about the approximation order, which were 
proved automatically by Isabelle. 

lemma ordapPerp: assumes "e ⊑ perp" shows "e = perp" 
lemma ordapVar: assumes "Var v ⊑ e" shows "e = Var v" 
lemma ordapVar_converse: 
  assumes "e ⊑ Var v" shows "e = perp ∨ e = Var v" 
lemma ordapAp: 
  assumes "Ap h es ⊑ e'" 
  shows "∃es'. e' = Ap h es' ∧ size es = size es' 
         ∧ (ALL i < size es. es!i ⊑ es'!i)" 
lemma ordapAp_converse: 
  assumes "e' ⊑ Ap h es" 
  shows "e' = perp ∨ 
         (∃es'. e' = Ap h es' ∧ size es = size es' 
          ∧ (ALL i < size es. es'!i ⊑ es!i))" 

The inductive proof for theorem crwlPolarity again considers each CRWL 
rule in turn. In the case for B we have t = perp, hence we just have to apply 
ordapPerp to get t' = perp, and then use the CRWL rule B. Regarding RR, since 
then t = Var v, by ordapVar_converse we get that either t' = perp or t' = 
Var v. The first case is trivial, and in the latter we just have to apply ordapVar, 
getting e' = Var v, which is enough for Isabelle to finish the proof 
automatically. The case of DC is more complicated. Again we obtain two cases, t' 
= perp and t' a constructor application, by using lemma ordapAp_converse. 
While the first case is trivial, the second one requires some involved reasoning 
over the list of arguments, using the information we get from applying lemma 
ordapAp. Finally, the proof for OR is similar to the second case of the proof for 
DC, with a similar manipulation of the list of arguments, and the use of lemma 
ordapAp to obtain the induction hypothesis for the arguments. 

Once again we find that this proof does not require any hypothesis on the 
linearity or the constructor discipline of the program: this is indeed quite obvious 
because this property only talks about what happens when we replace some 
subexpression by perp. 
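
Polarity can likewise be observed on a concrete instance. The following Python sketch (again a test under our assumptions, not a proof) uses a bounded evaluator with hypothetical names `den`, `leq` and `below`, and checks that every approximation of every value of a smaller expression is a value of a bigger one.

```python
from itertools import product

# Assumed encoding: ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
PERP = ("perp",)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def ap_subst(theta, e):
    """Apply a substitution given as a dict from variable names to expressions."""
    if e[0] == "var":
        return theta.get(e[1], e)
    if e[0] == "perp":
        return e
    return (e[0], e[1], tuple(ap_subst(theta, a) for a in e[2]))

def match(p, t, theta):
    """Extend theta so that ap_subst(theta, p) == t; None if impossible."""
    if theta is None:
        return None
    if p[0] == "var":
        bound = theta.get(p[1])
        if bound is None:
            return {**theta, p[1]: t}
        return theta if bound == t else None
    if p[0] != "cs" or t[0] != "cs" or p[1] != t[1] or len(p[2]) != len(t[2]):
        return None
    for pa, ta in zip(p[2], t[2]):
        theta = match(pa, ta, theta)
    return theta

def den(prog, e, depth):
    """Values of e derivable with at most `depth` nested uses of rule OR."""
    vals = {PERP}                                         # rule B
    if e[0] == "var":
        vals.add(e)                                       # rule RR
    elif e[0] in ("cs", "fs"):
        for ts in product(*(den(prog, a, depth) for a in e[2])):
            if e[0] == "cs":
                vals.add(("cs", e[1], ts))                # rule DC
            elif depth > 0:                               # rule OR
                for lhs, rhs in prog:
                    if lhs[1] == e[1] and len(lhs[2]) == len(ts):
                        theta = {}
                        for p, t in zip(lhs[2], ts):
                            theta = match(p, t, theta)
                        if theta is not None:
                            vals |= den(prog, ap_subst(theta, rhs), depth - 1)
    return vals

def leq(a, b):
    """Approximation ordering on partial expressions."""
    if a == PERP:
        return True
    if a[0] == "var" or b[0] in ("var", "perp"):
        return a == b
    return (a[:2] == b[:2] and len(a[2]) == len(b[2])
            and all(leq(x, y) for x, y in zip(a[2], b[2])))

def below(t):
    """All approximations of a partial c-term t."""
    out = {PERP}
    if t[0] == "var":
        out.add(t)
    elif t[0] == "cs":
        out |= {("cs", t[1], ts) for ts in product(*(below(a) for a in t[2]))}
    return out

zero, one = con("zero"), con("one")
prog = [(fun("coin"), zero), (fun("coin"), one)]
small = con("pair", PERP, fun("coin"))
big = con("pair", fun("coin"), fun("coin"))

bigger_values = den(prog, big, 1)
polarity_holds = leq(small, big) and all(
    t2 in bigger_values for t in den(prog, small, 1) for t2 in below(t))
print(polarity_holds)   # prints True
```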

Finally we tackle the compositionality of CRWL, which says that if we 
take a context with just one hole and an expression, then the set of values for 
the expression put in that context is the union of the sets of values obtained 
by putting each value of the expression in that context. 

theorem compCRWL: 
  assumes "oneHole xC" 
  shows "den P (apCon e xC) = 
         (⋃t∈den P e. den P (apCon t xC))" 

We have proved the two set inclusions separately as auxiliary lemmas compCRWL1 
and compCRWL2. The proofs of these lemmas are quite laborious but essentially 
proceed by induction on the CRWL-proof in their hypothesis, using it to build 
a CRWL-proof for the statement in the conclusion. In these proofs, Lemma 
noHoleApDontCare from Subsect. 3.2 is fundamental. 
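
The statement of compositionality, and the role of the one-hole assumption, can be explored with a bounded evaluator. The Python sketch below relies on the same assumptions as before (tuple encoding, depth-bounded search, hypothetical names `den`, `ap_con`, `union_over_values`); it compares both sides of the equation for a one-hole context and for a context with two holes.

```python
from itertools import product

# Assumed encoding: ("hole",) | ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
HOLE = ("hole",)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def ap_subst(theta, e):
    """Apply a substitution given as a dict from variable names to expressions."""
    if e[0] == "var":
        return theta.get(e[1], e)
    if e[0] == "perp":
        return e
    return (e[0], e[1], tuple(ap_subst(theta, a) for a in e[2]))

def match(p, t, theta):
    """Extend theta so that ap_subst(theta, p) == t; None if impossible."""
    if theta is None:
        return None
    if p[0] == "var":
        bound = theta.get(p[1])
        if bound is None:
            return {**theta, p[1]: t}
        return theta if bound == t else None
    if p[0] != "cs" or t[0] != "cs" or p[1] != t[1] or len(p[2]) != len(t[2]):
        return None
    for pa, ta in zip(p[2], t[2]):
        theta = match(pa, ta, theta)
    return theta

def den(prog, e, depth):
    """Values of e derivable with at most `depth` nested uses of rule OR."""
    vals = {PERP}                                         # rule B
    if e[0] == "var":
        vals.add(e)                                       # rule RR
    elif e[0] in ("cs", "fs"):
        for ts in product(*(den(prog, a, depth) for a in e[2])):
            if e[0] == "cs":
                vals.add(("cs", e[1], ts))                # rule DC
            elif depth > 0:                               # rule OR
                for lhs, rhs in prog:
                    if lhs[1] == e[1] and len(lhs[2]) == len(ts):
                        theta = {}
                        for p, t in zip(lhs[2], ts):
                            theta = match(p, t, theta)
                        if theta is not None:
                            vals |= den(prog, ap_subst(theta, rhs), depth - 1)
    return vals

def ap_con(e, c):
    """Put the expression e in every hole of the context c."""
    if c == HOLE:
        return e
    if c[0] in ("var", "perp"):
        return c
    return (c[0], c[1], tuple(ap_con(e, a) for a in c[2]))

def union_over_values(prog, e, ctx, depth):
    """Right-hand side of compCRWL: union of den(ctx[t]) for t in den(e)."""
    return set().union(*(den(prog, ap_con(t, ctx), depth)
                         for t in den(prog, e, depth)))

zero, one, X = con("zero"), con("one"), var("X")
prog = [(fun("coin"), zero), (fun("coin"), one),
        (fun("dup", X), con("pair", X, X))]
coin = fun("coin")

ctx_one = fun("dup", HOLE)            # one hole
ctx_two = con("pair", HOLE, HOLE)     # two holes: outside the theorem's scope

same_one = den(prog, ap_con(coin, ctx_one), 2) == union_over_values(prog, coin, ctx_one, 2)
same_two = den(prog, ap_con(coin, ctx_two), 2) == union_over_values(prog, coin, ctx_two, 2)
print(same_one, same_two)   # prints True False
```

With two holes the left-hand side contains `pair(zero, one)`, obtained by evaluating each copy of `coin` independently, whereas the right-hand side plugs the same value into both holes.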

Again, while theorem compCRWL requires the context to have just one hole, 
it does not assume the linearity or constructor discipline of the program. This 
came as a surprise to us, and initially made us doubt the accuracy of our 
formalization of CRWL. But it turns out that although CRWL is designed to 
work with CRWL-programs, which fulfil these restrictions, it can also be applied 
to general programs. For those programs some properties, such as the theorems 
crwlClosedCSubst, crwlPolarity, and compCRWL, still hold, but other 
fundamental properties do not, in particular the strong adequacy results w.r.t. its 
operational counterparts of [8,12,1]. The point is that for those programs CRWL 
does not deliver the "intended semantics" anymore. This is not strange, 
because that semantics was designed with CRWL-programs in mind. For example, 
consider the non-linear program $\mathcal{P} = \{f(X,X) \to a\}$. There is a CRWL-proof 
for the statement $\mathcal{P} \vdash f(a, b) \to a$ (take $\theta = [X/\bot]$ in rule OR, since both 
arguments reduce to $\bot$ by rule B), but this value cannot be computed in any 
of the operational notions of [8, 12, 1] nor in any implementation of FLP, in 
which the independence of the matching process of the arguments (derived 
from left-linearity of program rules) is assumed. It is also not very natural 
that $f(a, b)$ could yield the value $a$ when the arguments $a$ and $b$ are different 
values, which implies that the semantics defined by CRWL for non-left-linear 
programs is rather odd. But that is not a big problem, because we only care 
about the properties of CRWL for the kind of programs it has been designed 
to work with. And if it enjoys some interesting properties for a bigger class of 
programs, that is fine, because those nice properties will be inherited by the class 
of CRWL-programs. 
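
The non-left-linear example can be replayed with a bounded evaluator that follows the four rules. The Python sketch below is an illustration under our own assumptions (tuple encoding, depth-bounded search, hypothetical names `den`, `match`, `ap_subst`); it confirms that the value a is derivable for f(a, b) because both arguments can be cut down to the same partial value.

```python
from itertools import product

# Assumed encoding: ("perp",) | ("var", x) | ("cs", c, args) | ("fs", f, args)
PERP = ("perp",)

def var(x):
    return ("var", x)

def con(c, *args):
    return ("cs", c, tuple(args))

def fun(f, *args):
    return ("fs", f, tuple(args))

def ap_subst(theta, e):
    """Apply a substitution given as a dict from variable names to expressions."""
    if e[0] == "var":
        return theta.get(e[1], e)
    if e[0] == "perp":
        return e
    return (e[0], e[1], tuple(ap_subst(theta, a) for a in e[2]))

def match(p, t, theta):
    """Extend theta so that ap_subst(theta, p) == t; None if impossible.

    A repeated pattern variable must be bound to the same value each time.
    """
    if theta is None:
        return None
    if p[0] == "var":
        bound = theta.get(p[1])
        if bound is None:
            return {**theta, p[1]: t}
        return theta if bound == t else None
    if p[0] != "cs" or t[0] != "cs" or p[1] != t[1] or len(p[2]) != len(t[2]):
        return None
    for pa, ta in zip(p[2], t[2]):
        theta = match(pa, ta, theta)
    return theta

def den(prog, e, depth):
    """Values of e derivable with at most `depth` nested uses of rule OR."""
    vals = {PERP}                                         # rule B
    if e[0] == "var":
        vals.add(e)                                       # rule RR
    elif e[0] in ("cs", "fs"):
        for ts in product(*(den(prog, a, depth) for a in e[2])):
            if e[0] == "cs":
                vals.add(("cs", e[1], ts))                # rule DC
            elif depth > 0:                               # rule OR
                for lhs, rhs in prog:
                    if lhs[1] == e[1] and len(lhs[2]) == len(ts):
                        theta = {}
                        for p, t in zip(lhs[2], ts):
                            theta = match(p, t, theta)
                        if theta is not None:
                            vals |= den(prog, ap_subst(theta, rhs), depth - 1)
    return vals

a, b, X = con("a"), con("b"), var("X")
prog = [(fun("f", X, X), a)]          # the non-left-linear rule f(X, X) -> a

values = den(prog, fun("f", a, b), 1)
print(a in values)                     # prints True

# The only argument values that match the pattern (X, X) are (perp, perp):
matching = [ts for ts in product(den(prog, a, 1), den(prog, b, 1))
            if match(X, ts[1], match(X, ts[0], {})) is not None]
print(matching == [(PERP, PERP)])      # prints True
```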

On the other hand, for programs not following the constructor discipline, we 
will not even be able to match an argument of a rule that is not 
a constructor term, because in rule OR we have to reduce every argument of a 
function call to a value, which will be a c-term by Lemma ctermVals (see the 
extended version of this paper), and so can never be an instance of an expression 
containing function symbols. Thus, rule OR can never be used with program 
rules that do not follow the constructor discipline. 



5 Conclusions 

This paper presented a formalization of the essentials of CRWL [7,8], a well- 
known semantic framework for functional logic programming, in the interactive 
proof assistant Isabelle/HOL. We chose that particular logical framework for 
its stability and its extensive libraries. The Isar proof language allowed us to 
structure the proofs so that they become quite elegant and readable, as can be 
observed by looking at the Isabelle code. 

Our formalization is generic with respect to syntax, and includes important 
auxiliary notions like substitutions or contexts. This is in contrast to previous 
work [4, 5] that focused on formalizing the semantics of each concrete program. 
Our paper instead focuses on developing the metatheory of the formalism, 
allowing us to obtain results that are more general and also more powerful: we 
formally prove essential properties of the paradigm like polarity or 
compositionality of the CRWL-semantics. We plan to extend our theories so that we will be 
able to reason about properties of concrete programs by deriving theorems that 
express verification conditions in the line of those stated in [4, 5]. 

While developing the formalization we realized an interesting fact not pointed 
out before: properties like polarity or compositionality do not depend on the 
constructor discipline and left-linearity imposed on programs. However, such 
requirements will certainly play an essential role when extending our work to 
formally relate the CRWL-semantics with operational semantics like the one 
developed in [12], one of our intended subjects of future work. We think that 
this could be interesting in several ways. First of all, it would be a further step in 
the direction of challenge 3 of [2], "Testing and Animating wrt the Semantics", 
because we would end up getting an interpreter of CRWL during the process. We 
should then also formalize the evaluation strategy for the operational semantics, 
obtaining an Isabelle proof of its optimality. Finally, there are precedents [13, 12] 
of how the combination of a denotational and operational perspective is useful 
for general semantic reasoning in FLP. 